Unix
Unix cheat sheet

This is a very simplified and rough introduction to using a terminal on a Unix machine. The Unix command line interface is a very powerful environment and there is much more to it than is described here.

This document describes:
* Some useful concepts
* A brief overview of the command line shell
* Directories and the file system
* Working with files
* Reading the contents of text files
* Invoking executables
* Redirection and pipes
* Permissions and access

Some useful concepts

This is a brief overview of some useful concepts in Unix.
* The shell: Although Unix has a graphical interface called the X Window System, it is often easier and quicker to run programs by typing commands into a terminal window. Unix machines are usually accessed from other operating systems through a terminal client, e.g. PuTTY for Windows.
* Users: All programs are run as a specific user, so you have to log into the system as that user with a password.
* Files and processes: Everything is a file or a process, and the input and output of files and processes can be sent to each other (see pipes and redirection).
* Permissions: All files, directories and programs have access permissions. A user cannot see the contents of a file or run a program unless the permissions allow it.

A brief overview of the command line shell

The shell is the program that runs when you open a terminal window. It provides a lot of information and tools to help you run programs.

The command prompt: When you open a terminal, the text at the bottom of the screen next to the cursor will look something like this:

 interaction[maq]:/home/projects/MicrobialGenomicsGroup>

This is useful because it tells you who you are and where you are. The prompt can be configured in different ways, but the example above shows the machine name, the username in square brackets '[ ]' and the current directory after the colon ':'.
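The three parts of the prompt can also be printed with ordinary commands. The lines below are a minimal sketch, assuming a POSIX-style shell. The variable names (user, machine, here, prompt) are made up for illustration, and your own prompt may be laid out differently.

```shell
# Collect the three pieces of information a prompt typically shows.
user=$(whoami)       # who you are
machine=$(uname -n)  # which machine you are on
here=$(pwd)          # where you are in the file system

# Assemble them in the same layout as the example prompt above.
prompt="${machine}[${user}]:${here}>"
echo "$prompt"
```

Running the commands one at a time (whoami, uname -n, pwd) is a quick way to check where you are when the prompt has been configured to show less information.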
Command line history and auto-completion:
* Previously entered commands can be edited or executed again using the up arrow key.
* Filenames can be auto-completed using the tab key.

The environment:
* When you log in to a machine or open a terminal, a set of variables, collectively called the environment, is created. These variables do things like telling the shell where to look for programs.
* The printenv command lists all the environment variables. The list can be quite long.

 printenv SHELL

This command prints the path of your default shell program, which is usually the shell you are currently running. Unix offers a selection of shells, all of which are subtly different.

 printenv HOME

This command prints the location of the user's home directory.

 printenv PATH

The PATH variable is particularly important. It consists of a colon-separated list of directories which are searched when a command is typed. It is often useful to edit the PATH variable to add directories where executables are stored.

Getting help: Many Unix commands have one or more manpages (manual pages). Try typing man commandname to see them.

Directories and the file system

The shell logs you into a directory in the file system. There are some rules for the file system:
* Case sensitivity: the names of files, directories and executables are case sensitive, so x.txt and X.txt are two different files.
* Path delimiter: the Unix shell uses the forward slash '/' to separate files and directories, NOT the backslash '\' like MS-DOS.

There are some special characters which are used when defining the location of a file, directory or program:
* The root directory: The root directory is defined by the single slash '/' and represents the first node of the directory tree. It is similar to 'C:\' on a Windows machine.
* The '.' directory: The '.' character is used to define the current directory when it is part of a file path.
* The home directory: Most users are assigned a home directory where files can be created. This can be referenced using the '~' character.
* Absolute versus relative paths: Absolute paths are defined from the root directory, e.g. /usr/bin/perl. Relative paths are defined from the current directory, e.g. ./script.pl. Paths can also be given relative to the home directory, e.g. ~/script.pl (the shell expands '~' to the absolute path of the home directory).

Here are some useful commands to assist in getting around the file system:
* pwd: prints the current, or "working", directory
* cd: changes the current working directory to a new location

 cd /usr/bin

* mkdir: creates a new directory

 mkdir newdir

* rmdir: removes a directory. The directory must be empty, so it may not contain any subdirectories or files.

 rmdir newdir

Working with files

Files reside in directories and can contain text or binary information. Files can be created, copied, moved, renamed and deleted with the following commands.

ls: lists the contents of directories. Run as plain 'ls', the command lists the names of the contents without any other information. More information can be obtained by supplying some options.

list showing permissions, user, group, size, modification date and filename
 ls -l
as above, but file sizes are printed in human-readable form
 ls -lh
as above, but sort the results by file size in descending order
 ls -lhS
as above, but in ascending order
 ls -lhSr
list the contents of all directories recursively from the current directory
 ls -lR
list files sorted by modification time, most recently modified first
 ls -lt
as above, but with the oldest files first
 ls -ltr

touch: used with a filename. If the file doesn't already exist, a new empty file is created. Otherwise the modification date of the file is changed to the current time.

 touch filename

cp: copies a file to a new filename or into a directory and leaves the original file untouched. With the -r option it can also be used to copy directories.

 cp original_file new_file
 cp file directory/
 cp -r directory new_directory

mv: moves or renames a file. Unlike cp, the file no longer exists under its original name or location afterwards.
It can also be used to move whole directories together with their contents.

 mv original_file new_file
 mv file directory/
 mv dir1 dir2

rm: deletes a file. With the -r option it can also be used to delete a directory and its contents. Use with care: there is no undo.

 rm file
 rm -r dir

File name advice: It's best not to use spaces or special characters such as " ' < > $ @ in filenames. Underscores '_' and hyphens '-' are fine.

Reading the contents of text files

The contents of text files (but not binary files) can be read quite easily through the terminal.

cat: concatenates the given files and prints the result to the standard output. With a single file it simply prints that file.

 cat file1 file2

more: shows the contents of a text file one screen at a time. Press 'q' to return to the prompt.

 more filename

less: better than more because the up and down keys can be used to scroll up and down. Some useful key commands are:
* space: scroll forward one screen
* b: scroll backward one screen
* g: scroll to the top of the file
* G: scroll to the end of the file
* q: quit to the prompt
* /text: searches the file for the word 'text'

 less filename

head: shows the first few lines of the given file(s). A hyphen and a number can be passed to determine how many lines are shown.

 head filename
 head -5 filename

tail: shows the last few lines of the given file(s). A hyphen and a number can be passed to determine how many lines are shown.

 tail filename
 tail -5 filename

grep: searches through text files for a search term and prints the matching lines. grep is a complex command and has many options.

to search two files for a search term
 grep searchterm file1 file2
to print the lines of two files that do NOT contain the search term
 grep -v searchterm file1 file2
to count the number of matching lines per file
 grep -c searchterm file1 file2

wc: counts the number of words or lines a file contains.

print the word count of a file
 wc -w file.txt
print the line count of a file
 wc -l file.txt

Invoking executables

Some files can be marked as executable, which means they can be run as programs and perform tasks.
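To make "marked as executable" concrete, the following is a minimal sketch, assuming a POSIX-style shell and a temporary directory that allows programs to be run. The script name hello.sh and the variable names are hypothetical. The chmod command used here is the standard tool for changing permissions.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a tiny shell script (hello.sh is a made-up name).
printf '#!/bin/sh\necho "hello from a script"\n' > hello.sh

ls -l hello.sh       # the permission string has no x yet
chmod u+x hello.sh   # mark the file as executable for its owner
ls -l hello.sh       # the owner's permissions now include x

# The current directory is usually not on the PATH,
# so the script is called with a relative path.
output=$(./hello.sh)
echo "$output"
```

Calling the script as plain hello.sh would normally fail with "command not found", because the shell only searches the directories listed in PATH.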
Invoking executables residing in the PATH directories: These executables can be invoked without including the path to the executable. So even though a program called perl can be found in /usr/bin, because this directory is normally part of the PATH it can be called as

 perl myscript.pl

Invoking non-PATH executables: These must be called by giving an absolute or relative path to the program.

 /usr/bin/perl myscript.pl

Getting help: There are a number of ways of getting help in Unix, but they vary a lot from command to command.
* man: many commands have a manual page, which can be viewed with e.g. "man ls"
* whatis: may provide a one-line description of the command, e.g. "whatis wc"
* apropos: returns a list of commands with the keyword in their manual page
* arguments: executables take arguments, e.g. to set options and to name the files to be operated on. Options are usually prefixed by one or two hyphens.
* -h or --help: many programs print a short usage summary when run with one of these options
* wildcards: the asterisk * matches any sequence of characters (including none), whilst the ? character matches exactly one character

Redirection and pipes

Most Unix processes write their output to the standard output (normally the terminal screen) and take their input from the standard input (normally the keyboard). There is also a standard error, which is usually the terminal screen as well. These inputs and outputs can be redirected to files. Unfortunately the details of redirection can vary slightly depending on which shell program is being used.

Redirecting output: The standard output can be redirected to a file using the > character. An existing file of that name is overwritten.

 ls -l > list.txt

Appending to a file

 echo 'hello' >> list.txt

Redirecting input

 sort < filewithdata.txt

Both at the same time

 sort < input.txt > output.txt

Pipes '|' allow the output of one process to be sent directly to the input of another process, without going through a file.
Pipes can be chained to build complex "pipelines" of commands. The output of command one is passed to command two, and so on. Note that grep takes regular expressions rather than shell wildcards. The following counts the entries in the current directory whose names do not end in .txt and that contain 'coli':

 ls -1 | grep -v '\.txt$' | grep -c 'coli'

File system permissions

File system permissions ensure that file contents or executables can only be examined or invoked by users with the correct authorisation. They are also a common source of problems when using data or programs created by others, where you can't access a file or directory due to the permissions set.

To view permissions type ls -l in a directory containing some files. The output will look like this:

 -rw-r----- 1 user1 cdrom 11802 Jul 9 10:02 file.txt

The 10 character string "-rw-r-----" describes the permissions. The first character gives the file type ('-' for an ordinary file, 'd' for a directory). In the remaining nine characters, r indicates read permission, w indicates write permission, x indicates that the file can be executed, and a hyphen indicates that the permission has not been granted. There are three sets of r, w, x or - which control access by the user to whom the file belongs, by members of the file's group, and by anyone else, in that order. For example, -rwxrwxrwx means anyone can read, write and execute the file.

* whoami: will display your username
* groups: will display the groups you are a member of
* chmod: file permissions can be changed using the chmod command

The chmod command takes a complex set of arguments:
* The user, group or others are represented by u, g and o. a is used to represent all three.
* Whether permissions are granted or revoked is determined using + or - respectively.
* Read, write or execute permissions are represented by r, w and x.

So to remove write and execute permissions for the group and others, try

 chmod go-wx data.txt

To give everyone read and write permissions

 chmod a+rw data.txt

How to count the number of columns in a tab-separated file

Sometimes you may need to know how many columns are in a text file. This command prints the number of columns in the last line of the file, using tab (\t) as the separator. To change the separator to ';', change the expression FS="\t" to FS=";":

 awk 'BEGIN {FS="\t"} END {print NF}' filename
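Because NF in the END block refers to the last line read, the count is only representative if every line has the same number of fields. The following is a minimal sketch, assuming a POSIX-style shell and awk. The file table.tsv and its contents are invented for illustration.

```shell
workdir=$(mktemp -d)

# Build a small tab-separated file: three columns, two rows.
printf 'gene\tstart\tend\n' >  "$workdir/table.tsv"
printf 'abc1\t100\t250\n'   >> "$workdir/table.tsv"

# Number of fields in the last line only.
last=$(awk 'BEGIN {FS="\t"} END {print NF}' "$workdir/table.tsv")
echo "$last"

# Number of fields in every line, reduced to the distinct values.
# More than one value in the result means the rows are ragged.
counts=$(awk 'BEGIN {FS="\t"} {print NF}' "$workdir/table.tsv" | sort -u)
echo "$counts"
```

For this sample file both commands print 3. The second form is a quick sanity check before relying on a single column count for a larger file.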